Tool calling, vision, thinking traces, reaction UX, threaded replies#30
Merged
Conversation
Drops the chatgpt package and the openai SDK in favor of provider-native
clients. The OpenAI chat/completions wire format was a 2023-vintage
lowest-common-denominator that blocked modern features (thinking, native
tools, structured output, vision, free telemetry); none of which the bot
uses today, but none of which it could ever use through the compat shim.
Architecture changes:
- New lib/chat-backends.js exposes makeOllamaChat() and makeGeminiChat()
factories. Each returns a uniform { chat({ messages }) -> { text } }
interface. Adapters bind model + system message at construction and
own all wire-format translation; handleMessage is backend-agnostic.
- lib/chat.js drops the chatgpt-specific parentId Map. Conversation
history is now persisted in Keyv as a per-user [{role, content}]
array under "convo:<userId>", so chats survive process restarts.
History is trimmed to historyLimit (default 20 messages) per turn.
- lib/deps.js picks the backend via CHAT_BACKEND (default "ollama").
Reuses the existing @google/genai client for the Gemini chat path so
we don't construct two of them.
Env changes:
- Drop OPENAI_API_KEY, LLM_API_BASE_URL, LLM_MODEL, MEMORY_MAX_KEYS.
- Add CHAT_BACKEND, OLLAMA_HOST, OLLAMA_MODEL, GEMINI_CHAT_MODEL.
Verified end-to-end against http://kepler.local:11434 with gemma4:31b:
single-turn replies in character; two-turn conversation correctly
recalled the user's earlier statement via the persisted history.
36 tests pass (8 new for the adapter wire-format translation), lint
clean, syntax-check clean on all source files.
Co-Authored-By: Claude Opus 4.7 <[email protected]>
Native Ollama + Gemini chat backends, replacing the OpenAI chat/completions compatibility shim. Conversation history now owned by the bot and persisted in Redis so it survives restarts. Co-Authored-By: Claude Opus 4.7 <[email protected]>
Defines a small tool registry (lib/tools.js) that Data can choose from on
each turn: generate_image, tell_dad_joke, state_asimovs_laws, show_help,
start_dance_party, play_rickroll, play_tiktok. handleMessage gains a
dispatch loop (capped at 5 iterations) that runs tool calls and feeds
results back as tool-role messages until the model returns plain text.
Both chat adapters now translate the canonical { tools, toolCalls } shape
to/from their native wire format:
- Ollama: { type: 'function', function: {...} } and tool_calls on the
assistant message; tool-role results carry tool_name.
- Gemini: functionDeclarations under config.tools, functionCall parts
in candidates, functionResponse parts for tool results.
Tool round-trip traffic stays ephemeral inside the dispatch loop — only
the final assistant text is persisted to convoStore so conversation
history remains a clean user/assistant alternation.
Per-message tool registries are built in app.js so closures bind the
right Slack channel for upload + post side effects. The makeTools
factory accepts rng and sleep so tests can pin the dad-joke zinger
without 10-second waits.
Drops the isImageRequest matcher + IMAGE_REQUEST_GUIDANCE shim entirely
— Data now just generates the image when asked instead of telling the
user to type /image.
Verified live against gemma4:31b on kepler.local:
- "Draw me a picture of a sehlat." -> tool call with a rich prompt,
upload posted, reply "I have generated an image of a sehlat for you."
- "What is 2 + 2?" -> "Four." (no tool calls when not needed)
53 tests pass (17 new across chat/chat-backends/tools/responses), lint
clean.
Closes #25
Co-Authored-By: Claude Opus 4.7 <[email protected]>
Detects image attachments on incoming Slack messages, fetches each via
the Slack file URL with the bot token, base64-encodes and attaches to
the user turn in the canonical chat shape as
`images: [{mimeType, data}]`.
Both adapters translate to their native wire format:
- Ollama: Message.images is a list of base64 strings (mimeType ignored).
- Gemini: parts get { inlineData: { mimeType, data } } interleaved with
the text part.
Image bytes are NOT persisted to convoStore — only the text portion of
the user turn lands in history. The Slack file URLs would expire even
if we did, and the bytes are big.
Empty text plus images is now a valid turn (the model handles "what
is this?" implicitly). Truly empty messages (no text, no files) still
short-circuit before reaching the LLM.
In app.js, both the DM/MPIM and direct-mention handlers now run
`extractMessageImages` before calling handleMessage, gated on the
message having `files` of `image/png|jpeg|webp|gif` mimetype. Failed
file fetches log a warning and skip.
7 new tests covering handleMessage image attach + persistence stripping,
empty-text-with-images, and both adapter translations.
Closes #26
Co-Authored-By: Claude Opus 4.7 <[email protected]>
handleMessage now returns { text, thinking? } instead of a bare string.
When the Ollama backend emits message.thinking (qwq, gpt-oss,
deepseek-r1, etc.), app.js renders the reply as a Slack Block Kit
message with a context block showing an italicized "Thinking: …"
trace above the actual reply text.
Wiring:
- New OLLAMA_THINK env var. Accepts true/false (or 1/0/yes/no) and
low/medium/high. Parsed in deps.js, passed to makeOllamaChat at
construction time.
- Ollama adapter forwards the think param on every chat() call and
surfaces response.message.thinking as result.thinking.
- handleMessage carries the trace through the tool dispatch loop —
only the final non-tool response's thinking is surfaced.
- Thinking traces are NOT persisted to convoStore: ephemeral metadata,
same as tool call traffic.
- Long traces are clipped to 1200 chars in the Slack context block to
keep messages readable.
Gemini support for thinking deferred — gemini-3-flash-latest has a
thinking field but its API shape is still settling; can be added to
makeGeminiChat without further changes to handleMessage.
6 new tests across handleMessage + Ollama adapter (think param wiring,
trace surfacing, persistence stripping).
Closes #27
Co-Authored-By: Claude Opus 4.7 <[email protected]>
The post-then-delete dance for the "thinking" indicator created message churn, occasional delete failures to log around, and ugly Slack history. Reactions are the right primitive for "I'm working on this" — Slack ephemeralizes them for free. - `addThinkingReaction` / `removeThinkingReaction` replace postThinking / clearThinking. Both handlers add 🧠 to the user's message and remove it before replying. - manifest.yaml: adds reactions:read and reactions:write scopes. - Drops the THINKING_MESSAGE env var, lib/deps.js thinkingMessage field, and matching field destructure in app.js — no longer needed. - Reply path simplifies: no more `let thinking = null` mutation, no post-and-track ts threading through the try/catch. Closes #28 Co-Authored-By: Claude Opus 4.7 <[email protected]>
Channel @-mentions used to bail entirely on threaded messages, and top-level mentions replied at the channel root — which floods the channel for any back-and-forth. Now Data threads its replies: - @DaTa at top level -> reply starts a thread rooted at the mention - @DaTa inside an existing thread -> reply continues that thread - Tool side effects (image uploads, joke posts, asimov recitation, etc.) also land in-thread because buildToolsFor plumbs thread_ts through into chat.postMessage and files.uploadV2. Bail conditions in the direct-mention handler are tightened to just { bot_profile, bot_id, edited } — message.thread_ts and message.parent_user_id no longer cause Data to drop the message, so follow-up @-mentions inside threads Data started are now answered. The bot_id/bot_profile guard still prevents loops with other bots. DMs are unchanged — they reply flat as before. Closes #29 Co-Authored-By: Claude Opus 4.7 <[email protected]>
scarolan
added a commit
that referenced
this pull request
Jun 14, 2026
…constraint PR #30 added a tool registry and dispatch loop to let the LLM decide when to fire features like image generation. In practice it didn't earn its keep: - 6 of the 7 tools (tell_dad_joke, state_asimovs_laws, show_help, start_dance_party, play_rickroll, play_tiktok) duplicated regex matchers that already worked perfectly — faster, model-agnostic. - The one tool that genuinely benefited from LLM understanding (generate_image) is unreliable on gemma4:31b and gemma4-openhands: both emit `<call:generate_image{...}><tool_call|>` as plain text instead of populating Ollama's structured tool_calls field. That's the gemma family wire-format, not a fixable prompting issue. - Other models that tool-call reliably (qwen3.6:27b) aren't vision- capable. Picking between "tools work but no vision" and "vision works but no tools" is a false choice — vision is the real value- add, and /image already handles image generation cleanly. Rolled back: - lib/tools.js and test/tools.test.js deleted. - lib/chat-backends.js: removed tools/toolCalls translation from both Ollama and Gemini adapters. They're back to messages/text/thinking. - lib/chat.js: dispatch loop gone. handleMessage is a single chat call with persistence. - app.js: dropped buildToolsFor and the tools param to handleMessage. - lib/responses.js: restored isImageRequest + IMAGE_REQUEST_GUIDANCE. - app.js: both handlers now nudge users toward /image when they ask for an image in chat (instead of dispatching a tool that gemma can't reliably call). Kept (still earning their keep): - Vision via message.images — works great with gemma4. - 🧠 reaction UX from #28. - Thread-aware @-mention replies from #29. - Thinking traces from #27 — captured but not rendered. 46 tests pass (down from 65 after dropping tool-specific suites), lint clean. Co-Authored-By: Claude Opus 4.7 <[email protected]>
scarolan
added a commit
that referenced
this pull request
Jun 14, 2026
Adds a "Model selection notes" section to CLAUDE.md documenting: - gemma4:31b: great vision, broken structured tool calls - qwen3.6:27b: reliable tools, no vision - What's pulled on kepler.local - Why tool calling was removed (6/7 tools were regex-duplicates) - Slack file_share subtype gotcha for vision - reactions:read/write scope + manifest reinstall requirement - Why thinking-trace rendering was suppressed despite the plumbing being kept So future sessions don't have to relitigate any of it. Co-Authored-By: Claude Opus 4.7 <[email protected]>
This was referenced Jun 14, 2026
Owner
Author
|
Main carries this work directly — closing the PR. After this branch merged, real-Slack testing surfaced that gemma4 (the model we actually want to run, for its strong vision) emits tool calls as inline text instead of populating Ollama's structured The current shape of main:
Closing. |
scarolan
added a commit
that referenced
this pull request
Jun 15, 2026
…ment
- Default OLLAMA_MODEL is now gemma4:31b (production's actual default),
not llama3.1. Aligned README, .env.example, CLAUDE.md.
- Remove the thinking-trace subsystem end to end: the OLLAMA_THINK /
think knob, message.thinking capture in the Ollama adapter, the
{ thinking? } return field through handleMessage, and the
buildReplyPayload shim. Rendering was already suppressed (#27) with
no clean way to hide traces in Slack; this removes the dead plumbing.
The 🧠 "working on it" reaction is unchanged.
- Fix stale app.js header comment that claimed tool calls are
dispatched in lib/chat.js (tool calling was removed in #30).
- Update tests and docs (ARCHITECTURE, AGENTS, CLAUDE #27 lessons).
Co-Authored-By: Claude Opus 4.8 <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #25, #26, #27, #28, #29.
5 commits, one per issue. All passing 65 tests + lint on every step.
Per-issue summary
#25 — Native tool calling
lib/tools.jsregistry:generate_image,tell_dad_joke,state_asimovs_laws,show_help,start_dance_party,play_rickroll,play_tiktok.handleMessageruns a dispatch loop (capped at 5 iterations) — feeds tool results back as tool-role messages until the model returns plain text. Tool traffic is ephemeral; only the final assistant text is persisted.{ tools, toolCalls }shape to/from native wire format (Ollamatype:'function'wrappers; GeminifunctionDeclarations/functionCall/functionResponse).isImageRequestmatcher +IMAGE_REQUEST_GUIDANCEshim. Data now just generates the image.gemma4:31bonkepler.local: "Draw me a picture of a sehlat." → rich tool call, image uploaded. "What is 2+2?" → "Four." (no spurious tool calls.)#26 — Vision
extractMessageImagesinapp.jsfiltersmessage.filesfor image MIME types, fetches via Slack file URL with the bot token, base64-encodes.images: [{ mimeType, data }]on user turns. Ollama gets bare base64 strings; Gemini getsinlineDataparts.#27 — Thinking traces
handleMessagereturn shape changed fromstringto{ text, thinking? }.OLLAMA_THINKenv var (true|low|medium|high).thinkthrough and surfacesmessage.thinking.app.js::buildReplyPayloadrenders the trace as an italicized context block above the reply text. Trace clipped at 1200 chars.#28 — Reactions instead of "thinking" message
:brain:reaction added to the user's message instead of posting/deleting a thinking message.reactions:read+reactions:write.THINKING_MESSAGEenv var andthinkingMessagefrom deps/wiring.#29 — Thread-aware @-mention replies
{ bot_profile, bot_id, edited }—thread_tsandparent_user_idno longer cause Data to drop the message.say()calls in the direct-mention handler route through asayInThreadwrapper usingthread_ts: message.thread_ts || message.ts.buildToolsForplumbsthread_tsintochat.postMessageandfiles.uploadV2.Deploy notes
reactions:read,reactions:write) — Slack app needs a manifest re-sync + reinstall before reactions land.OLLAMA_THINK(off by default).THINKING_MESSAGE(was unused after the reaction switch).Test plan
bot-data.service./imageinstruction prompt.:brain:reaction during processing, then reply.@data hello→ expect reply inside a thread.@datainside an existing thread → expect reply continues the thread.OLLAMA_THINK=true→ expect italicized "Thinking: …" context block above the reply.